MASSACHUSETTS INSTITUTE OF TECHNOLOGY 
ARTIFICIAL INTELLIGENCE LABORATORY 

A.I. Memo No. 614 


6 June 1981 


Equation Counting and the Interpretation 

of Sensory Data 

W.A. Richards, J.M. Rubin & D.D. Hoffman 


ABSTRACT 

Many problems in biological information processing require the solution to a complex system of 
equations in many unknown variables. An equation-counting procedure is described for determining 
whether such a system of equations will indeed have a unique solution, and under what conditions 
the solution should be interpreted as "correct". Three examples of the procedure are given for 
illustration, one for auditory signal processing and two from vision. 


Acknowledgement: This report describes research in Natural Computation at 
the Artificial Intelligence Laboratory and Psychology Department of the 
Massachusetts Institute of Technology. Support for this research is provided 
in part by the Advanced Research Projects Agency of the Department of Defense 
under Office of Naval Research contract N00014-80-C-0505, and by NSF and AFOS 
under grant 7923110-MCS. D.D. Hoffman was also supported by a Hughes Aircraft 
Company graduate fellowship and J.M. Rubin was supported by an NSF Fellowship. 

We greatly appreciate the help of Professor M. Artin and T. . .. f , 

pointed us to the Bezout Theorem and the Jacobian test, and also the insightful 
comments of Professor S. Ullman and A. Pentland. Technical assistance in the 
preparation of this manuscript was provided by C.J. Papineau. 

© MASSACHUSETTS INSTITUTE OF TECHNOLOGY 1981 




WAR, JMR & DDH 2 EQUATION COUNTING 

1. Introduction 

Sensory data are routinely interpreted as external events by biological systems. This achievement 
is the classical problem of perception: given a pattern of sensory activity, what are the external 
events that caused this activity? In order for an organism to survive, such assignments of cause, or 
interpretations, must be reliable and appropriate. Yet the sensory data by themselves are ambiguous 
(as illustrated by the projection of the three-dimensional world onto our two-dimensional retina). The 
appropriate interpretation of a pattern of activity is thus just one of many possibilities. The objective 
of this paper is to outline the power and pitfalls of an equation-counting procedure, and how this 
procedure can lend insight into the interpretation process. 

The ambiguity of the sensory activity becomes very clear when formal relations are developed 
between these sense data (the givens or “knowns”) and the external events (or “unknowns”) that 
generate the data (Marr, 1976, 1982; Ullman, 1979). When such relations are expressed in the form 
of equations relating the “knowns” to the “unknowns”, then the number of unknowns will almost 
always exceed the number of equations. The incompleteness of the set of equations is a consequence 
of the fact that the mapping of a world event into the sensor entails a loss of information and hence is 
usually many-to-one. But if the system of equations is incomplete, with the number of equations less 
than the number of unknowns, then the system cannot be solved uniquely and constructing a unique 
description of the external event becomes impossible. 

Fortunately, events in the real world are not arbitrary, but are constrained by natural laws. The 
sense data reflect these constraints (Huffman, 1971; Clowes, 1971; Waltz, 1975). Once discovered, 
these additional relationships can yield the remaining equations needed to make the number of equations 
equal to the number of unknowns. A unique solution to the set of equations may then be sought, 
permitting an interpretation of the data. (The correctness or validity of the interpretation will be 
discussed later.) 

The paper begins with a rather simple example of “equation-counting,” namely, the detection 
of a narrow-band signal in noise. This problem involves only linear equations, but still illustrates 
the general features of the approach and raises three issues: 1) independence of the equations; 2) 
constraints needed to yield a unique solution, and 3) whether this unique solution is indeed “correct”. 
We then introduce a theorem by Bezout which is needed to place bounds on the number of possible 
solutions to polynomial equations, as well as a Jacobian test for the independence of these equations. 
Finally, two other problem examples are given to illustrate further details. One example concerns 
recovering structure from visual motion; the other shows why three spectral samples are needed to 
distinguish shadows from reflectance changes. 

Figure 1. An illustration of a narrow-band signal against a background of noise. The noise is broadband 
with a constant time-averaged spectrum. 


2. A Classic Problem 

A problem faced by many animals is the need to isolate a narrow-band, species-specific signal 
from the background noise. Although examples may be found in every sense modality, the clearest 
probably occur in audition. Consider the bird listening to the call of its mate in the forest of other 
sounds; the dog perking his ears at his master’s whistle; or the moth's task of isolating the cry of the 
bat as it homes in for its next meal. In each case, the signal is confined to a relatively narrow band, as 
illustrated in Figure 1, whereas the competing noise is much broader. Given that the frequency band 
of the signal is known (as it would be for the bird or the moth), how many intensity samples must be 
taken to isolate the signal from the noise? 

Clearly, by referring to Figure 1, we see that sampling in the signal band at frequency $f_1$ alone will not 
allow us to isolate the signal. More formally, the ear will receive intensity $I$ at frequency $f_1$ equal to 
the sum of the power produced by each source: 

$$I(f_1) = S(f_1) + N(f_1) \tag{1}$$

where S corresponds to the power of the narrow-band signal at $f_1$ and N is the background noise at 
the same frequency. Since only I is available to the listener, S and N cannot be separated, for we 
have only one equation in two unknowns, S and N. More generally, if we allow additional samples at 
time intervals $t_j$, then equation (1) can be generalized to: 

$$I(f_1, t_j) = S(f_1, t_j) + N(f_1, t_j) \tag{2}$$

Thus, for T time samples we will obtain T equations in 2T unknowns, which will not permit a unique 
solution for S. 

Let us now make the obvious next step and consider frequency samples outside the signal band. 
The frequency $f_1$ in equation (2) then becomes the indexed frequency $f_i$. However, since the signal is zero outside 
the band at $f_1$, $S(f_i, t_j) = 0$ for $i \neq 1$. These conditions may be expressed as two equations: 

$$I(f_i, t_j) = S(f_i, t_j) + N(f_i, t_j) \tag{3a}$$

$$S(f_i, t_j) = 0, \quad (i \neq 1) \tag{3b}$$

Letting F and T be the number of frequency and time samples, respectively, there will be a total of 
$F \cdot T$ equations of form (3a) and $(F - 1) \cdot T$ equations of form (3b). The total number of equations 
is thus $2 \cdot F \cdot T - T$. Similarly, the total number of unknowns will be $F \cdot T$ for S and $F \cdot T$ for N, or 
$2 \cdot F \cdot T$. In order to solve uniquely for S, the minimum condition is that the number of equations E 
equal (or exceed) the number of unknowns U: 

$$E \geq U \tag{4}$$

For solution, equations (3a,b) thus must pass the following inequality test: 

$$2FT - T \geq 2FT \tag{5}$$

or 

$$0 \geq T$$

which fails since $T \geq 1$. Thus a narrow-band signal cannot be extracted from the broad-band noise 
without specifying further constraints upon either the signal or the noise. 


2.1 Flat Noise Condition 

Very often noise is relatively constant over frequency (or time), for example, the hum of an air 
conditioner, a steady wind flow passing the body, or even body noise. This condition can be expressed 
by the following relation: 

$$N(f_i, t_j) = N(f_1, t_j) \tag{6}$$

where $i \neq 1$ and $f_1$ serves as the reference frequency. We now see that for a total of F frequency 
samples, equation (6) adds $(F - 1) \cdot T$ equations but no more unknowns. Applying the Inequality 
Test (4), we now find: 

$$(2FT - T) + T(F - 1) \geq 2FT \tag{7}$$

or 

$$FT \geq 2T$$

or 

$$F \geq 2 \tag{8}$$

Thus, the minimum condition for a unique solution occurs for two frequency samples at any 
temporal interval. Ignoring the time variable, equations (3a, b), and (6) then become 

$$I(f_1) = S(f_1) + N(f_1) \tag{9}$$

$$I(f_2) = S(f_2) + N(f_2)$$

$$S(f_2) = 0$$

$$N(f_1) = N(f_2).$$

We now have four equations in four unknowns, which allows us to solve for $S(f_1)$, given that the 
noise spectrum is flat. 
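
The counting argument of Sections 2 and 2.1 can be checked mechanically. The following is a minimal Python sketch, not part of the original memo; the function names `count_equations` and `passes_inequality_test` are hypothetical and chosen only for illustration. It tallies the equations of form (3a), (3b) and, optionally, the flat-noise relation (6), and compares the total with the number of unknowns.

```python
def count_equations(F, T, flat_noise=False):
    """Return (E, U): equations and unknowns for F frequency and T time samples."""
    E = F * T                 # form (3a): I = S + N at every (frequency, time) sample
    E += (F - 1) * T          # form (3b): S = 0 at every frequency outside the signal band
    if flat_noise:
        E += (F - 1) * T      # form (6): noise at f_i equals noise at the reference f_1
    U = 2 * F * T             # one S and one N per (frequency, time) sample
    return E, U


def passes_inequality_test(F, T, flat_noise=False):
    """Inequality test (4): E >= U is the minimum condition for a unique solution."""
    E, U = count_equations(F, T, flat_noise)
    return E >= U


# Tabulate (E, U) without and with the flat-noise constraint for one time sample.
for F in (1, 2, 3):
    print(F, count_equations(F, 1), count_equations(F, 1, flat_noise=True))
```

Without the flat-noise constraint the test fails for every F and T; with it, two frequency samples suffice. Passing the count says nothing yet about independence, which is taken up next.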


2.2 Independence and Uniqueness 

Although two frequency samples plus the constraint of “flat noise” yield the same number of 
equations as unknowns, these equations must be shown to be independent. Certainly we can reduce 
equations (9) to obtain an explicit solution for $S(f_1)$, thereby demonstrating independence. However, 
in the more complex cases normally encountered, such a reduction is often difficult or may be impossible 
(for example if fifth degree polynomials are involved). We therefore seek a more general test for 
independence. 

In the above example, the obvious test is to recast equations (9) so all the unknowns are on the right 
hand side (R.H.S.) of the equality, and all the knowns are on the L.H.S. Then the determinant of the 
coefficients of the R.H.S. can be calculated. By “Cramer’s Rule”, we know that if this determinant 
is not zero, then the equations have a unique solution (Thomas, 1951). To proceed, equations (9) 
are rearranged so the unknowns are ordered in the sequence $S(f_1)$, $N(f_1)$, $S(f_2)$, $N(f_2)$ and are each 
aligned in their separate columns on the R.H.S. of the equality. Since there are four unknowns and 
four equations, the matrix of the coefficients of the unknowns will be as follows: 

$$\begin{pmatrix} 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & -1 & 0 & 1 \end{pmatrix}$$

The determinant of this matrix is easily found to be 1 (i.e., the matrix has maximum rank), and hence the set 
of equations (9) must have a unique solution. 

We now can proceed with confidence to find the following solution for $S(f_1)$: 

$$S(f_1) = I(f_1) - I(f_2) \tag{11}$$
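
As a check on the determinant and on the solution for the signal, the four equations (9) can be solved directly. The sketch below is an illustration only: `solve_linear` is a hypothetical helper written for this example (plain Gauss-Jordan elimination in exact rational arithmetic), and the intensity values 7 and 3 are assumed, not taken from the memo.

```python
from fractions import Fraction


def solve_linear(A, b):
    """Gauss-Jordan elimination in exact rational arithmetic.

    Returns (det, x). If det is zero there is no unique solution and x is None.
    """
    n = len(A)
    # Augmented matrix [A | b] with exact fractions.
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    det = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero entry in this column.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return Fraction(0), None
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            det = -det            # a row swap flips the sign of the determinant
        p = M[col][col]
        det *= p
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return det, [row[n] for row in M]


# Coefficient matrix of equations (9); columns are S(f1), N(f1), S(f2), N(f2).
A = [
    [1, 1, 0, 0],   # I(f1) = S(f1) + N(f1)
    [0, 0, 1, 1],   # I(f2) = S(f2) + N(f2)
    [0, 0, 1, 0],   # 0     = S(f2)
    [0, -1, 0, 1],  # 0     = -N(f1) + N(f2)
]
I1, I2 = 7, 3       # illustrative intensities (assumed values)
det, (S1, N1, S2, N2) = solve_linear(A, [I1, I2, 0, 0])
print(det, S1)
```

The recovered signal equals the difference of the two intensities, whatever values are supplied, which is exactly why the question of corroboration arises below.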



2.3 Corroboration and Constraint 

Unfortunately, any pair of sensory intensities $I(f_1)$ and $I(f_2)$ will provide a value for $S(f_1)$. How do 
we know, therefore, that the obtained value for $S(f_1)$ is indeed correct? Clearly if the noise stimulus 
is not flat over frequency, but varies as shown in Fig. 1, then the solution for $S(f_1)$ will be wrong 
because the assumed condition does not apply. Without some evidence supporting the “flat noise” 
assumption, a meaningful interpretation of the intensity values $I(f_1)$, $I(f_2)$ cannot be made. 

Ideally, any assumed condition, such as the flat noise condition, that is introduced to match the 
number of equations to the unknowns should be a regularity in the world or a “law” that is never (or 
rarely) broken by nature. Such conditions are difficult to discover, but when found and introduced 
into the system of equations provide powerful constraints on the solutions. Often the constraint may be 
a statistical regularity (Witkin, 1980; Pentland, 1980). Poor choices for constraints are those conditions 
that are very narrow and restrictive and which do not capture a very general property of the world. 


In the case of detecting a narrow-band signal in “flat-noise”, the imposed condition is very restrictive. 
However, some attempt can be made to verify the validity of invoking this condition. For 
example, one possibility might be to examine other frequencies to see if the relation $N(f_1) = N(f_i)$ 
holds for a range of frequencies outside the signal band. (Note that the solution for $S(f_1)$ should 
also hold.) If so, then the chance that the “flat-noise” condition is invalid is reduced, although the 
uncertainty is never eliminated. Sampling at additional frequencies thus provides some (weak) corroboration 
for the interpretation, increasing its likelihood. (In fact, the condition assumed here has 
merely been replaced by another, less restrictive assumption about the smoothness of waveforms.) 
Stronger forms of corroboration will be discussed in later sections. 

Finally, it should be noted that in cases where the imposed conditions are not verifiable, the appropriateness 
of the condition can often be rejected quite easily. For example, if $S(f_1)$ is found to be 
negative, then since negative signals are not physically realizable, the assumption must not be valid. 
This strategy of rejecting certain conditions or possible states of the environment has been found 
useful elsewhere (Rubin and Richards, 1981). 


3. Non-Linear (Polynomial) Equations 

3.1 Bezout’s Theorem 

In the above example, all of the equations were linear, and simple techniques of linear algebra 
could be used. What if one or more of the equations were quadratic or a still higher degree polynomial? 
In such cases, which are quite common, each nth order polynomial will at most have n distinct 
roots. How many possible solutions will there be if there are M polynomial equations of degree N? 
Can we even guarantee that there will in fact be a finite set of solutions? If this cannot be guaranteed, 
then the test that states the number of equations E should at least equal the number of unknowns U 
is not useful, and the simple equation-counting procedure collapses at the outset. Fortunately, Bezout’s 
Theorem tells us under what conditions a finite set of solutions can be found to N equations in N 
unknowns, and just what the maximum number of solutions will be (Van der Waerden, 1940). 


Theorem (Bezout): A set of N independent polynomial equations in N variables will have a 
maximum number of generic solutions equal to the product of the degrees of the equations. 1 

The above theorem is critical for our procedure because it states that if the relations among the N 
variables can be cast as N independent polynomial equations (perhaps by a change in the form of the 
variables), then there will be a finite set of isolated solution points. Furthermore, this set will include 
all the possible solutions. (See Appendix II for a brief discussion of a generalization of Bezout’s 
Theorem by Sard to include any set of smooth functions on manifolds.) For linear equations, it is 
clear that the product of the degrees of the equations will always be one, and only one solution set will 
be found. For third order equations, which may include terms such as $x \cdot y \cdot z$ or $y^2 \cdot z$, the number 
of possible N-tuples of variables that satisfy the N equations can be quite high. Among these is the 
physically meaningful solution that we seek, provided our hypotheses are correct. 
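
The bound in the theorem is simple to compute. The sketch below is illustrative only; `bezout_bound` and `line_circle_intersections` are hypothetical names. The second function uses a line and a circle (degrees 1 and 2, so a bound of 2) as an assumed toy system to show that the number of real solutions can fall below the bound, and that the tangent case is the kind of non-generic solution the theorem sets aside.

```python
from math import prod


def bezout_bound(degrees):
    """Upper bound on the number of isolated solutions: the product of the degrees."""
    return prod(degrees)


def line_circle_intersections(m, c, r):
    """Count real intersections of the line y = m*x + c with the circle x^2 + y^2 = r^2."""
    # Substituting the line into the circle gives a quadratic in x:
    # (1 + m^2) x^2 + 2 m c x + (c^2 - r^2) = 0
    disc = (2 * m * c) ** 2 - 4 * (1 + m * m) * (c * c - r * r)
    if disc > 0:
        return 2          # two distinct crossings (generic)
    return 1 if disc == 0 else 0   # tangency (non-generic) or no real solution


print(bezout_bound([1, 1, 1, 1]))   # four linear equations
print(bezout_bound([2] * 6))        # six quadratic equations
```

The bound counts all solutions, real or not; which of them are physically meaningful has to be settled separately.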


3.2 The Jacobian Test 

Bezout’s Theorem states that in principle, N polynomial equations of any degree can provide a 
solution to N unknowns, if the equations are independent. In our simple first example, the determinant 
of the matrix of coefficients of the unknowns was used to check for independence. More 
generally, the Jacobian of the set of equations should be evaluated (Kendig, 1977; Guillemin and 
Pollack, 1974). The Jacobian is formed by taking all N partial derivatives of each of the N equations 
($\partial f_1 / \partial x_1$, $\partial f_2 / \partial x_2$, ..., $\partial f_n / \partial x_n$), and placing these partial derivatives in an N × N matrix, where 
the columns represent each unknown and the rows correspond to the equations. Clearly, for linear 
equations, the Jacobian is simply the matrix of the coefficients of the unknowns of each equation. 

1 By a generic solution, we mean that a slight perturbation in the values of the variables will not alter the solution 
appreciably (as would be the case if the solution were the special case of two circles just grazing each other rather than 
intersecting, for example). 


Jacobian Test (for Independence): If the determinant of the Jacobian of the system of N equations 
in N unknowns is non-zero, then a countable set of isolated solution points can be found. 


This test is simply an application of the Inverse Function Theorem, which gives a condition for a 
one-to-one and onto mapping between real variables. Note that if the determinant of the Jacobian 
collapses to zero (by a loss of rank), then this is not a proof that solution points cannot be found. The 
Jacobian test is therefore a test for sufficiency, not necessity. 
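
A numerical version of the Jacobian test can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the helper names `jacobian` and `determinant`, the central-difference step `h`, and the two-equation toy systems (a unit circle with a line, and a unit circle with a scaled copy of itself) are all introduced here for the example.

```python
import math


def jacobian(funcs, point, h=1e-6):
    """Numerical Jacobian by central differences: rows are equations, columns unknowns."""
    J = []
    for f in funcs:
        row = []
        for j in range(len(point)):
            hi = list(point)
            lo = list(point)
            hi[j] += h
            lo[j] -= h
            row.append((f(*hi) - f(*lo)) / (2 * h))
        J.append(row)
    return J


def determinant(M, eps=1e-12):
    """Determinant by Gaussian elimination with partial pivoting."""
    M = [list(row) for row in M]
    n, det = len(M), 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < eps:
            return 0.0                      # no usable pivot: the matrix has lost rank
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            det = -det
        det *= M[col][col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return det


def circle(x, y):
    return x * x + y * y - 1.0


def line(x, y):
    return x - y


def scaled_circle(x, y):
    return 2.0 * circle(x, y)               # carries no new information


point = (math.sqrt(0.5), math.sqrt(0.5))    # lies on both the circle and the line
det_independent = determinant(jacobian([circle, line], point))
det_dependent = determinant(jacobian([circle, scaled_circle], point))
print(det_independent, det_dependent)
```

With finite differences the dependent case comes out numerically close to zero rather than exactly zero, so in practice a tolerance is needed when deciding that the Jacobian has dropped rank.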


3.3 Summary of Procedure 

To apply the “equation counting” method to the recovery of event descriptions from limited sensory 
data, we therefore proceed as follows: 

1. Set up polynomial equations describing the mapping of the external (unknown) variables into 
the (known) sense data. 

2. Embody as many constraints as necessary in the form of additional polynomial equations relating 
the variables in order that the total number of equations equals the number of unknowns that are to 
be recovered. Whenever possible, choose “constraints” that can be verified from the data. Those that 
capture a regular or consistent property of the world are the best choice. 

3. Apply the Jacobian test to demonstrate that the equations are independent. Bezout’s Theorem 
then guarantees that there will be a finite number of solution points. If the Jacobian test fails, try to 
discover new constraints. (See also Section 5.6.) 

4. Proceed to solve for the variables of interest. (We know of no simple heuristics for this step.) 

5. Demonstrate that all constraints and conditions are valid. Usually this will involve taking an 
extra, independent measurement and verifying that the same solution is obtained. Some care must be 
taken with this step, however, as will be seen in the examples to follow. 

6. The sense data may now be given a preliminary interpretation. However, a final interpretation 
should await two further tests to be described subsequently. One is the exclusion of competing interpretations, 
the other is corroboration, using an independent system of equations. (See Sections 6.0 and 
6.1.) 

4. Two Examples 


4.1 Example 1: Recovering Structure from Motion 

The difference in visual impressions between a static scene and a dynamic movie is often quite 
striking. Somehow the motion created by viewing a rapid sequence of frames will transform an ambiguous 
2-D shape into a vivid 3-D structure. Perhaps the most common example of this phenomenon 
occurs when we walk, run, or drive and immediately know the spatial configuration of the objects 
about us, regardless of whether we use two eyes or one. Although Ullman (1979) has shown how the 
spatial relations may be recovered using motion information in the general case, we wish to consider 
a simpler version of the same problem that has a more compact solution: namely, given a person in 
locomotion, how can he recover the orientation of the surface on which he walks? 

Let the surface be covered with markings, or for convenience, let a short “stick” lie on the surface 
patch of particular interest. Then if the observer looks at the center of the “stick” as he moves ahead, 
the image of the “stick” as seen on his retina will rotate and change length as shown in frames F1, F2, 
and F3 of Fig. 2. Because the stick lies in a plane of fixed orientation relative to the moving observer, 
the orientation of the surface patch can be specified by the axis of rotation of the “stick”. The problem 
then is equivalent to recovering the axis of rotation of a rotating rod seen by a stationary observer. 

Figure 2 illustrates the general form of this common problem. The “stick” or rod is rotating in 
3-space and is projected onto a single 2-D retina. Let each of these retinal images be discrete time 
samples or frames as in a TV. Given only the three (or more) ambiguous 2-D image frames F1, F2, 
F3, how can the axis of rotation of the rod be recovered? This is a task that is solved easily by the 
human observer, although no information other than the 2-D motion of the end points of the rod is 
available (Johansson, 1975). 

The inset to Figure 2 shows the actual three-dimensional relation between the viewer, the rotating 
rod, and the axis about which the rod is spinning. Note that the axis of rotation (which defines the 
surface plane) can be any stationary vector and need not be vertical nor parallel to the xy image 
plane. The problem is to recover the correct axis of rotation (as well as the length of the rod). 


4.2 Rigid Rod and Rotation in a Plane (F) 

Let the coordinate system be centered at the projection of the midpoint of the rod. Then since the 
distance OA = OA', we need consider the motion of OA only. Let the three-dimensional coordinates 
of end A be $(x_1, y_1, z_1)$ for frame 1 and $(x_i, y_i, z_i)$ for frame i. Then since the “stick” is a rigid rod, we 
have the constraint that the rod length remains constant for any frame: 

$$x_1^2 + y_1^2 + z_1^2 = x_i^2 + y_i^2 + z_i^2 \tag{13}$$

For N frames, the relation (13) will yield $(N - 1)$ equations, each in two unknowns, $z_1$ and $z_i$ 
(since $x_i$, $y_i$ are observables in the image plane). So far we thus have $(N - 1)$ equations in N 
unknowns. 

Letting the end position of the unit axial vector N have the coordinates $x_0$, $y_0$, $z_0$, equation (14a) 
reduces to 

$$x_i \cdot x_0 + y_i \cdot y_0 + z_i \cdot z_0 = k \cos\theta \tag{14b}$$

where $k = (x_1^2 + y_1^2 + z_1^2)^{1/2}$. 

But rotation in a plane requires that the angle $\theta$ between the axis N and OA be $\pi/2$. Hence, $\cos\theta = 0$ 
and the value of k is irrelevant. For N frames, relation (14b) thus gives us N equations in 
three more unknowns: $x_0$, $y_0$, $z_0$. However, because the length of the rotation axis is irrelevant also, N 
can be taken as the unit vector and we obtain the additional equation 

$$x_0^2 + y_0^2 + z_0^2 = 1 \tag{14c}$$

Altogether, we thus have $(N - 1) + N + 1$ equations (E) in $N + 3$ unknowns (U): the $z_i$ together with $x_0$, $y_0$, 
$z_0$. (Note that all of these equations are polynomials.) The minimum number of equations can then be 
determined from the relation $E \geq U$: 

$$2N \geq N + 3$$

or 

$$N \geq 3$$
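
The same count can be written out explicitly for the rotating rod. The short sketch below is illustrative; `rod_counts` and `minimum_frames` are hypothetical names, and the tally simply mirrors the three groups of equations described in the text (rigidity, rotation in a plane, and the unit-length axis).

```python
def rod_counts(n_frames):
    """Return (E, U) for the rotating-rod problem with n_frames image frames."""
    rigidity = n_frames - 1      # equation (13): frames 2..N compared with frame 1
    planarity = n_frames         # equation (14b) with cos(theta) = 0, one per frame
    unit_axis = 1                # equation (14c)
    E = rigidity + planarity + unit_axis
    U = n_frames + 3             # one depth z_i per frame plus x_0, y_0, z_0
    return E, U


def minimum_frames(max_frames=10):
    """Smallest number of frames for which E >= U, or None if none is found."""
    for n in range(1, max_frames + 1):
        E, U = rod_counts(n)
        if E >= U:
            return n
    return None


print(rod_counts(2), rod_counts(3), minimum_frames())
```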


4.3 The Jacobian Test 

The next step is to demonstrate that the equations (13) and (14) form a set of independent equations. 
We thus examine the Jacobian for N = 3 to see if its rank is maintained. Recalling that $x_i$, $y_i$ 
for $i \neq 0$ are given in the image plane, the partial derivatives with respect to the $z_i$ in equation (13) for i = 2, 3 yield 
the first two rows of the following matrix, while the remaining rows come from equations (14b) 
and (14c) respectively (the columns correspond to the unknowns $z_1$, $z_2$, $z_3$, $x_0$, $y_0$, $z_0$): 

$$\begin{pmatrix}
2z_1 & -2z_2 & 0 & 0 & 0 & 0 \\
2z_1 & 0 & -2z_3 & 0 & 0 & 0 \\
z_0 & 0 & 0 & x_1 & y_1 & z_1 \\
0 & z_0 & 0 & x_2 & y_2 & z_2 \\
0 & 0 & z_0 & x_3 & y_3 & z_3 \\
0 & 0 & 0 & 2x_0 & 2y_0 & 2z_0
\end{pmatrix}$$



Evaluation of the determinant by MACSYMA shows that it is generally non-zero. However, certain 
relations between the variables may cause the Jacobian to drop rank. Some of those failure conditions 
can be noted by factoring the determinant. (Note that such failure conditions provide instances where 
any perceptual system that interprets data in accord with the system of equations should also fail. The 
factors thus provide example experiments for “instant psychophysics”.) 
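
The behaviour of this Jacobian can also be explored numerically. The following sketch is not the MACSYMA evaluation mentioned above; it is a hedged illustration in which the helper names (`determinant`, `rod_frames`, `rod_jacobian`), the rotation angles, and the two axis directions are all assumptions made for the example. It builds the 6 × 6 matrix for a synthetic rod rotating in a plane, once with the axis tilted out of the image plane and once with the axis lying in the image plane ($z_0 = 0$), which is one relation under which the matrix loses rank.

```python
import math


def determinant(M, eps=1e-12):
    """Determinant by Gaussian elimination with partial pivoting."""
    M = [list(row) for row in M]
    n, det = len(M), 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < eps:
            return 0.0                      # no usable pivot: the matrix has lost rank
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            det = -det
        det *= M[col][col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return det


def rod_frames(u, v, angles, length=1.0):
    """End point A of the rod at each angle; u, v are unit vectors spanning the rotation plane."""
    return [tuple(length * (math.cos(t) * a + math.sin(t) * b) for a, b in zip(u, v))
            for t in angles]


def rod_jacobian(points, axis):
    """6 x 6 Jacobian for three frames; columns are z1, z2, z3, x0, y0, z0."""
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = points
    x0, y0, z0 = axis
    return [
        [2 * z1, -2 * z2, 0, 0, 0, 0],       # (13), i = 2
        [2 * z1, 0, -2 * z3, 0, 0, 0],       # (13), i = 3
        [z0, 0, 0, x1, y1, z1],              # (14b), frame 1
        [0, z0, 0, x2, y2, z2],              # (14b), frame 2
        [0, 0, z0, x3, y3, z3],              # (14b), frame 3
        [0, 0, 0, 2 * x0, 2 * y0, 2 * z0],   # (14c)
    ]


angles = [math.radians(a) for a in (30, 60, 90)]

# Assumed generic case: axis tilted out of the image plane.
axis = (0.0, 0.6, 0.8)
points = rod_frames((1.0, 0.0, 0.0), (0.0, 0.8, -0.6), angles)
det_generic = determinant(rod_jacobian(points, axis))

# Assumed degenerate case: axis lying in the image plane (z0 = 0).
axis_flat = (0.0, 1.0, 0.0)
points_flat = rod_frames((1.0, 0.0, 0.0), (0.0, 0.0, 1.0), angles)
det_flat = determinant(rod_jacobian(points_flat, axis_flat))

print(det_generic, det_flat)
```

In the second configuration the three rows from (14b) and the row from (14c) involve only the three axis unknowns, so four rows share a three-dimensional column space and the determinant vanishes.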


4.4 Bezout’s Theorem and Uniqueness 

Although the set of equations (13) and (14) are shown to be “independent” by the Jacobian test, 
Bezout’s Theorem tells us that we may have up to $2^6 = 64$ possible solutions. (This is the product of 
the degrees of the six equations.) Which of these solutions do we pick? 

Fortunately, it can be shown by algebraic reduction of the six equations that of these 64 possible 
solutions, only two have real values, and one of these is simply a “reflection” of the other about the 
image plane (Hoffman and Flinchbaugh, 1981). 2 Thus, three snapshots or “frames” showing the x, y 
positions of the end points of a rotating rod are sufficient to solve for the rod length and its axis. (The 
reflection causes an ambiguity only in the direction of motion and orientation of the rod.) But since 
any triplet of x, y positions will yield a solution, how do we know that the measurements were taken 
from a rotating rod and not from a random set of points? Clearly additional tests must be performed 
before any meaningful interpretation can be given to the data. 


4.5 Corroboration 

In addition to the problem of isolating a unique solution point, it is also necessary to show that the 
“unique” solution is indeed plausible. (If the unique solution is not physically realizable, it can be 
rejected immediately.) In the case of the rod rotating in a plane about a fixed axis, three frames (or 
snapshots) were sufficient to solve the six polynomial equations and to obtain a unique solution for 
the rod’s length and its axis of rotation. However, are we guaranteed that no other set of conditions 
could generate the data? Clearly not, for if the simple rod rotation is simulated in the laboratory on 
a TV monitor, then one obvious interpretation is that there are two points moving on the face of the 
TV. (In fact, if reflections appear on the screen so that strong 3-D cues are present, then the illusion of 
a rod rotating in 3-D is lost.) 

Before a final interpretation should be made, it is therefore prudent to corroborate the solution 
to increase the probability for a correct interpretation. This can be accomplished by analyzing an 
independent set of data or hypotheses that are based on entirely distinct physical constraints. (In 
the case of structure from motion, stereopsis may be used.) Without such corroboration, the human 
observer seems to accept the interpretation that is most favored by the real-world statistics. 3 

2 In the event that algebraic reduction is not possible, then the uniqueness of a solution can be tested by generating 
data from several known, but arbitrary configurations, and by numerical evaluation determining if the correct solution is 
obtained (Ullman, personal communication). Numerical evaluation is recommended in any case as a further check for 
the isolation of solution points. 

3 In the rotating rod case where the screen or reflections are not visible, then because there is no contrary 3-D information, 
the 3-D interpretation will be accepted as most likely. 

5. Hidden Dependencies 

Quite often when the equation-counting method is used, the constraint equations contain hidden 
dependencies that cause the Jacobian to drop rank and its determinant to equal zero. There are two 
general procedures for handling this situation so that an interpretation of the data can be made. The 
first is simply to introduce another independent constraint, the second is to identify the dependency 
and to reduce the number of physical variables accordingly. The disambiguation of shadows and 
highlights illustrates these two methods. 


5.1 Example 2: Interpreting Shadows and Highlights 

Consider the very common situation in vision when two patches of surface A and B appear 
superficially different. Do A and B differ because they have different reflectances, or is one of the 
regions a highlight or a shadow on a surface of uniform reflectance? These two interpretations are 
different, since when B is a shadowed region, the implication is that there is an object occluding the 
direct light of the source, whereas in the highlight case, the difference between A and B is due to 
the specular properties of the surface and there is no cast shadow. 4 (If the darker region around the 
highlight were to be regarded as shadowed, then 99 per cent of the world would be interpreted as 
lying in shade!) 

As shown in Figure 3, let the observer view the surface from above, and let the surface be illuminated 
with at least two sources of illumination: one producing direct light, as from a sun, while 
the other source is diffuse, such as that characteristic of the sky and clouds. 

We proceed by noting that the only information available to the viewer is the image intensities 
$I_A$, $I_B$ from the two regions A and B. For simple Lambertian conditions, these image intensities will 
be the product of the strength of illumination times the reflectances of the surface material. Let the 
reflectance common to A and B be $R_\lambda$, where the subscript $\lambda$ indicates that R is a function of wavelength, 
and let $S_\lambda$ be the incident flux from the direct light of the sun and $D_\lambda$ the flux arising from the diffuse 
light from the sky, both of which are also functions of wavelength as indicated by the subscript. 5 If a 
region is neither highlighted nor shadowed, then the image intensity I will be given by 

$$I = (S_\lambda + D_\lambda) R_\lambda \tag{17a}$$

Equation (17a) thus describes the image intensity resulting from an unshadowed, matte surface. 




5.2 The Highlight Case 

4 Note that for this analysis we are ignoring other distinctive features of a highlight: 1) the textural aspect of specularity, 
2) its directional component which produces a disparity between the two eyes, and 3) that highlight edges are convex 
whereas shadow edges tend to be straight or concave. 

5 A planar surface is assumed; the effect of surface orientation on the source illumination can be considered incorporated 
into $S_\lambda$ and $D_\lambda$. 

Figure 3. Direct and diffuse light illuminate the surface. Is region A a highlight or is region B in shadow? 
Possible image intensities over wavelength are illustrated in the lower pair of graphs. 


If region A is the same flat surface as region B, except that it has a highlight, then B remains 
matte and $I_B$ is defined by equation (17a). On the other hand, equation (17a) will not apply to 
the highlighted region A, which acts like a partial mirror reflecting some fraction of the illuminated 
scene lying away from the viewer. The reflectance $R_\lambda$ will thus depend in part upon what the viewer 
sees in the reflection off A. In the case of the normal highlight, the arrangement between the direct 
source illumination $S_\lambda$, the surface, and the viewer is such that only the source light is reflected off 
the viewed surface and hence $R_\lambda = 1$ and $D_\lambda = 0$ (for the highlight only). This contribution from the 
reflected light to the image intensity $I_A$ is at the expense of the matte component of surface reflectance 
(Evans, 1948; Horn, 1977). Thus, if the highlight contribution to the image intensity is the fraction $f_H$, 
then the matte contribution will be $(1 - f_H)$. To characterize the image intensity $I_A$ corresponding to 
a partial highlight on region A, we may thus reduce the matte equation (17a) by the factor $(1 - f_H)$ 
and add to it the complementary fraction $f_H$ of specular light: 

$$I_{A\lambda} = f_H S_\lambda + (1 - f_H)(S_\lambda + D_\lambda) R_\lambda \tag{17b}$$

where the first term on the R.H.S. is the specular component and the second term is the matte component 
of the highlight. Note that only the illuminant $S_\lambda$ appears in the specular term because of the 
directional properties of the reflections off a highlighted region. 6 


5.3 The Shadow Case 

If region B is the same surface as A, but B is in shadow, then region B will be illuminated only 
by the diffuse light $D_\lambda$. The effect of shadowing is thus to reduce the illumination from $(S_\lambda + D_\lambda)$ 
to $D_\lambda$. Recognizing that shadows often have penumbrae, we may let $f_S$ be the fraction of the total 
illumination that contributes to the shaded region. For shadow, therefore, equation (17a) may be 
modified as follows: 

$$I_{B\lambda} = f_S (S_\lambda + D_\lambda) R_\lambda + (1 - f_S) D_\lambda R_\lambda \tag{18a}$$

which further simplifies to 

$$I_{B\lambda} = (f_S S_\lambda + D_\lambda) R_\lambda \tag{18b}$$

For complete shade, $f_S = 0$ and the image intensity $I_{B\lambda}$ arising from region B is described only by 
the product of the diffuse light times the reflectance. For no shade, $f_S = 1$; and for the penumbrae, 
$f_S$ lies between 0 and 1. 


5.4 Preliminary Equation Counting 

Equations (17b) and (18a) may be combined to obtain a single equation that describes the image
intensity for both the highlight and shadow conditions. This can be accomplished quite easily by
replacing the matte component in the highlight equation (17b) by the shadow relation of (18b). After
simplification, the resulting single equation will be

$$I_\lambda = f_H S_\lambda + (1 - f_H)(f_S S_\lambda + D_\lambda) R_\lambda \tag{19}$$

where the $\lambda$ subscript indicates a wavelength dependency, and $f_H$ and $f_S$ are respectively the highlight
and shadow fractions.

6 Note that the equation describing the highlight condition is similar to that used for transparency.
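As a quick check on equation (19), the following Python sketch evaluates it at a single wavelength and confirms that it reduces to the special cases discussed so far. The function and argument names are illustrative, not from the text. The fully matte, unshaded case is assumed to be $(S_\lambda + D_\lambda) R_\lambda$, which is the matte term that appears in (17b).

```python
def image_intensity(f_h, f_s, s, d, r):
    """Equation (19) at one wavelength.

    f_h: highlight fraction; f_s: shadow fraction (1 means no shade);
    s: source light; d: diffuse light; r: surface reflectance.
    """
    return f_h * s + (1 - f_h) * (f_s * s + d) * r


s, d, r = 0.8, 0.2, 0.5
print(image_intensity(0.0, 1.0, s, d, r))  # matte, unshaded: (s + d) * r
print(image_intensity(1.0, 1.0, s, d, r))  # pure highlight: s alone
print(image_intensity(0.0, 0.0, s, d, r))  # complete shade: d * r
```

With `f_h = 0` the function reproduces the shadow relation (18b), and with `f_s = 1` it reproduces the highlight relation (17b).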

If $I_\lambda$, $f_H$, $f_S$ are now indexed to indicate the spatial region, we can apply the standard
equation-counting procedure to determine the minimum number of wavelength and spatial samples needed to
solve for the physical variables $S_\lambda$, $R_\lambda$, $D_\lambda$, $f_{Hi}$ and $f_{Si}$ in terms of the known $I_{i\lambda}$, and then attempt to
determine whether the solution for these physical variables implies a shadow or highlight.

Unfortunately, the equation-counting procedure is unsatisfactory in this case for two reasons. First, 
the minimum number of spatial and spectral samples is biologically unfeasible (5 and 5 or 6 and 4, 
respectively); second, and more important, the Jacobian collapses. The collapse is due to hidden 
dependencies in the set of equations of the form (19). 
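The sample counts quoted above can be reproduced with a short Python sketch. It rests on a bookkeeping assumption that the text does not spell out: each wavelength sample contributes three unknowns ($S_\lambda$, $D_\lambda$, $R_\lambda$), each spatial sample contributes two ($f_H$, $f_S$), and every (region, wavelength) pair yields one equation of the form (19). The function names are illustrative.

```python
from itertools import product


def n_unknowns(n_spatial: int, n_spectral: int) -> int:
    # Assumed bookkeeping: S, D, R per wavelength; f_H, f_S per region.
    return 3 * n_spectral + 2 * n_spatial


def n_equations(n_spatial: int, n_spectral: int) -> int:
    # One equation of the form (19) per (region, wavelength) pair.
    return n_spatial * n_spectral


def minimal_samplings(limit: int = 12) -> list[tuple[int, int]]:
    """Return the (spatial, spectral) pairs with the fewest total samples
    for which there are at least as many equations as unknowns."""
    feasible = [
        (n, m)
        for n, m in product(range(1, limit + 1), repeat=2)
        if n_equations(n, m) >= n_unknowns(n, m)
    ]
    fewest = min(n + m for n, m in feasible)
    return sorted(pair for pair in feasible if sum(pair) == fewest)


print(minimal_samplings())  # [(5, 5), (6, 4)]
```

Under these assumptions the two smallest balanced samplings are 5 spatial with 5 spectral samples and 6 spatial with 4 spectral samples. Counting alone says nothing about the collapse of the Jacobian, which has to be tested separately.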


5.5 Eliminating Dependencies 

The most obvious strategy for eliminating dependencies among equations is to search for other 
independent relations or constraints. Often, this may be difficult, and a more desirable course is to try 
to reduce the number of unknowns by combining some of the physical variables whose solution is not 
critical to the interpretation. For example, if the pairs $S_\lambda R_\lambda$ and $D_\lambda R_\lambda$ occur together everywhere,
then we might consider replacing each pair by a single variable. Such a reduction would not affect the
ability to distinguish a shadow from a highlight. Each of these two procedures will now be illustrated. 


5.6 Solving for the Highlights by Adding Constraints 

To introduce additional independent constraining relations, we will consider the two-dimensional
case as shown in Figure 4 where a highlight (or shadow) runs across a change in reflectance $R_1$, $R_2$.
The highlight boundary is parallel to the Y axis; the reflectance change is parallel to the X axis. For
this two-dimensional case, equation (19) will assume the following form:

$$I_{xy\lambda} = f_x L_{y\lambda} + (1 - f_x) M_{y\lambda} \tag{20}$$

where $I_{xy\lambda}$ is the image intensity corresponding to one of the regions $A_1$, $B_1$, $C_1$ or $A_2$, $B_2$, $C_2$. Note
that since only two wavelength variables $L_\lambda$ and $M_\lambda$ are involved along the X axis, these variables
need to be indexed by Y only.


By simple equation-counting, it can be verified that the minimum number of samples along X, Y
and for $\lambda$ will be respectively either 3,1,3 or 3,3,1. (Note that Y and $\lambda$ appear together and hence can
be symmetrically indexed.) A further reduction can be obtained by noting that region $C_1$ or $C_2$, etc. is
always matte, and hence $f_C$ is zero. Thus $I_{Cy\lambda} = M_{y\lambda}$. The minimum for X, Y, $\lambda$ is then
3, 1, 2 or 3, 2, 1, which correspond to a set of six equations in six unknowns. The determinant of the
Jacobian of either system of equations is still zero, however.

Figure 4. View of a surface with a shadow or highlight boundary parallel to the Y axis and crossing a
region of two different reflectances, $R_1$ and $R_2$.
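The balance between equations and unknowns for systems of the form (20) can be checked with a short Python sketch. It assumes one unknown $f_x$ per sample along X, one $L$ and one $M$ per (Y, $\lambda$) pair, and one equation per (X, Y, $\lambda$) triple. The function name and the `matte_region` flag are illustrative, not from the text.

```python
def count_form_20(n_x: int, n_y: int, n_lambda: int, matte_region: bool = False):
    """Return (equations, unknowns) for a system of equations of the form (20)."""
    equations = n_x * n_y * n_lambda
    # One fraction f_x per X sample; a region known to be matte has f = 0.
    n_f = n_x - 1 if matte_region else n_x
    # L and M are indexed by Y and by wavelength only.
    unknowns = n_f + 2 * n_y * n_lambda
    return equations, unknowns


for sampling in [(3, 1, 3), (3, 3, 1)]:
    print(sampling, count_form_20(*sampling))
for sampling in [(3, 1, 2), (3, 2, 1)]:
    print(sampling, count_form_20(*sampling, matte_region=True))
```

The first two samplings give nine equations in nine unknowns, and the matte constraint brings the count down to six in six. As before, a balanced count is necessary but does not guarantee that the equations are independent.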


To solve the equations, we need to introduce one more constraint or reduce the number of variables.
For highlights, an additional constraint can be added by noting that the spectral composition
of the purely specular component is independent of the underlying reflectance $R_1$, $R_2$. Thus along Y,
$L_{i\lambda} = L_{j\lambda}$. The minimum X, Y, $\lambda$ samples are now X = 2, $\lambda$ = 1, Y = 2 (the symmetry between
Y and $\lambda$ has been removed by the specularity constraint), leading to the following equations:

$$\begin{aligned}
I_{B1} &= f_B L_1 + (1 - f_B) M_1 \\
I_{B2} &= f_B L_2 + (1 - f_B) M_2 \\
I_{C1} &= f_C L_1 + (1 - f_C) M_1 \\
I_{C2} &= f_C L_2 + (1 - f_C) M_2
\end{aligned} \tag{21}$$










where the indexing is for Y only, since there is only a single wavelength sample.

The Jacobian of the reduced set of the above equations obtained by substituting $L_1 = L_2$ and
$f_C = 0$ is:

$$\begin{vmatrix}
L_1 - M_1 & f_B & (1 - f_B) & 0 \\
L_1 - M_2 & f_B & 0 & (1 - f_B) \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{vmatrix} = f_B (M_2 - M_1)$$

which is non-singular provided $M_1 \neq M_2$ and $f_B \neq 0$.
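The value of this determinant can be confirmed numerically. The sketch below builds the Jacobian of the reduced system with respect to $(f_B, L_1, M_1, M_2)$ and expands it by cofactors with exact rational arithmetic. The function names and test values are illustrative, and the cofactor routine is a generic helper rather than anything taken from the text.

```python
from fractions import Fraction


def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def highlight_jacobian(f_b, l_1, m_1, m_2):
    """Jacobian of (I_B1, I_B2, I_C1, I_C2) with respect to (f_B, L_1, M_1, M_2),
    after substituting L_1 = L_2 and f_C = 0."""
    return [
        [l_1 - m_1, f_b, 1 - f_b, 0],
        [l_1 - m_2, f_b, 0, 1 - f_b],
        [0, 0, 1, 0],
        [0, 0, 0, 1],
    ]


f_b, l_1, m_1, m_2 = Fraction(1, 4), Fraction(9, 10), Fraction(1, 5), Fraction(1, 2)
d = det(highlight_jacobian(f_b, l_1, m_1, m_2))
print(d, d == f_b * (m_2 - m_1))  # 3/40 True
```

Setting `m_1 == m_2` or `f_b == 0` in the same call makes the determinant vanish, matching the two degenerate conditions named above.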

Thus solutions can be obtained for $f_B$, $M_1$, $M_2$ and particularly $L_1$, the specular component of the
light reflected off the surface.

$$L_{\mathrm{specular}} = \frac{I_{B2} I_{C1} - I_{B1} I_{C2}}{(I_{C1} - I_{C2}) - (I_{B1} - I_{B2})} \tag{22a}$$

$$1 - f_B = \frac{I_{B1} - I_{B2}}{I_{C1} - I_{C2}} \tag{22b}$$
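A round-trip check makes the highlight solution concrete. The Python sketch below generates the four image intensities from chosen values of $f_B$, $L_{\mathrm{specular}}$, $M_1$ and $M_2$ using the reduced equations (21) with $L_1 = L_2$ and $f_C = 0$, then recovers $f_B$ and $L_{\mathrm{specular}}$ from (22a) and (22b). The function names and numerical values are illustrative only.

```python
from fractions import Fraction


def render_highlight(f_b, l_spec, m_1, m_2):
    """Forward model: reduced equations (21) with L_1 = L_2 and f_C = 0."""
    i_b1 = f_b * l_spec + (1 - f_b) * m_1
    i_b2 = f_b * l_spec + (1 - f_b) * m_2
    return i_b1, i_b2, m_1, m_2  # (I_B1, I_B2, I_C1, I_C2)


def solve_highlight(i_b1, i_b2, i_c1, i_c2):
    """Invert the forward model using (22a) and (22b)."""
    denom = (i_c1 - i_c2) - (i_b1 - i_b2)
    if i_c1 == i_c2 or denom == 0:
        raise ValueError("singular case: M_1 == M_2 or f_B == 0")
    l_spec = (i_b2 * i_c1 - i_b1 * i_c2) / denom  # (22a)
    f_b = 1 - (i_b1 - i_b2) / (i_c1 - i_c2)       # (22b)
    return f_b, l_spec


intensities = render_highlight(Fraction(1, 4), Fraction(9, 10), Fraction(1, 5), Fraction(1, 2))
print(solve_highlight(*intensities))  # (Fraction(1, 4), Fraction(9, 10))
```

The guard on the denominators corresponds to the two conditions under which the Jacobian is singular, $M_1 = M_2$ and $f_B = 0$.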


5.7 Solving for Shadows by Combining Variables 

Returning to Figure 4, we may now reinterpret the regions $A_1$, $A_2$, $B_1$, $B_2$ in terms of a shadow
edge parallel to the Y-axis. (A penumbra will be needed for this constraint, implying that the minimum
number of spatial samples along X is three, although only two will be used, as in the highlight case.)

For shadows, the equation (19) then has the same form as the first four equations (21), with
$L_i = (S_i + D_i) R_i$ and $M_i = D_i R_i$, where $S$ and $D$ are respectively the source and diffuse light and $R$ is the
reflectance. Since for shadows $L_1 \neq L_2$ (i.e., there is no specular component superimposed on $A_1$, $A_2$,
or $B_1$, $B_2$), an additional constraining equation must replace this specular constraint. For illustration,
we will introduce a “gray world” condition, namely that the average of all surfaces reflecting the
source light is spectrally flat. Hence the diffuse light $D_i$ is simply some fraction $\gamma$ of the source light:

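$$D_i = \gamma S_i \tag{23a}$$

and

$$M_i = \gamma S_i R_i \tag{23b}$$

$$L_i = (1 + \gamma) S_i R_i \tag{23c}$$

Because $S$ and $R$ appear together in (23), we can eliminate this dependency by replacing each
product $S_i R_i$ with a single variable $S_i^*$. The four equations (21) then become:

$$\begin{aligned}
I_{B1} &= f_B (1 + \gamma) S_1^* + (1 - f_B) \gamma S_1^* = (f_B + \gamma) S_1^* \\
I_{B2} &= f_B (1 + \gamma) S_2^* + (1 - f_B) \gamma S_2^* = (f_B + \gamma) S_2^* \\
I_{C1} &= \gamma S_1^* \; (= M_1) \\
I_{C2} &= \gamma S_2^* \; (= M_2)
\end{aligned} \tag{24}$$

The determinant of the Jacobian is again zero, however, so dependencies are still present:

$$\begin{vmatrix}
(f_B + \gamma) & 0 & S_1^* & S_1^* \\
0 & (f_B + \gamma) & S_2^* & S_2^* \\
\gamma & 0 & 0 & S_1^* \\
0 & \gamma & 0 & S_2^*
\end{vmatrix} = 0 \tag{25}$$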
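One way to see why this determinant vanishes is to exhibit a linear relation among the columns of the Jacobian. The relation below is a worked check, not part of the original derivation. It assumes the columns are ordered as derivatives with respect to $(S_1^*, S_2^*, f_B, \gamma)$, where $S_i^*$ stands for the product $S_i R_i$, and the rows follow the order $I_{B1}, I_{B2}, I_{C1}, I_{C2}$.

```latex
% Columns of the Jacobian of (24) with respect to (S_1^*, S_2^*, f_B, \gamma)
c_1 = \begin{pmatrix} f_B + \gamma \\ 0 \\ \gamma \\ 0 \end{pmatrix}, \quad
c_2 = \begin{pmatrix} 0 \\ f_B + \gamma \\ 0 \\ \gamma \end{pmatrix}, \quad
c_3 = \begin{pmatrix} S_1^* \\ S_2^* \\ 0 \\ 0 \end{pmatrix}, \quad
c_4 = \begin{pmatrix} S_1^* \\ S_2^* \\ S_1^* \\ S_2^* \end{pmatrix}

% The columns are linearly dependent, so the determinant is zero:
S_1^* \, c_1 + S_2^* \, c_2 - \gamma \, c_4 = f_B \, c_3
```

Each row can be checked directly. For example, the first row gives $(f_B + \gamma) S_1^* - \gamma S_1^* = f_B S_1^*$.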


Rather than introducing a new constraint, we will ask whether some variables can be combined to
reduce further the number of unknowns. Ratios or products of the entries in the Jacobian are the
coefficients of the variables in the original set of equations, and consequently can be used to multiply two
of the equations to eliminate one variable. (In effect, we are exploring various triangular forms of
the matrix of rank one less than the original.) The appropriate ratios are thus those between the rows
in the same columns, because it is these factors that will be multiplied to eliminate the variable
that is identified with that column of the Jacobian matrix. Thus the ratios of the above
Jacobian that should be explored first are $(f_B + \gamma)/\gamma$, which appears in columns 1 and 2, and $S_1^*/S_2^*$,
which appears in columns 3 and 4. Inspection of (24) shows that the solution for these
reduced variables is quite simple:


$$\frac{S_1^*}{S_2^*} = \frac{I_{B1}}{I_{B2}} = \frac{I_{C1}}{I_{C2}} = \frac{S_1 R_1}{S_2 R_2} \tag{26a}$$

$$\frac{f_B + \gamma}{\gamma} = \frac{I_{B1}}{I_{C1}} = \frac{I_{B2}}{I_{C2}} \tag{26b}$$

Note the dependency between the image intensities, given by the criterion

$$I_{B1} I_{C2} = I_{C1} I_{B2} \tag{26c}$$

which is common to both (26a) and (26b). If the gray world condition applies and if $C(\lambda)$ is a shadow
on $B(\lambda)$, then the shadow relation (26c) will be true.

Unfortunately, there are an unlimited number of image intensity values that will satisfy the 
“shadow” relation (26c). How are we to be sure that they all correspond to the shadow condition and 
not to a reflectance change or even a highlight? To answer this question, we proceed in two stages, first 
to show that the shadow solution (26) never will correspond to a highlight, and hence shadows and 
highlights are at least disambiguated because their solutions are distinct. Then, we will illustrate how
the probability of other confounding spectral relations such as different materials can be set arbitrarily 
low by independent corroboration of the original solution. 


6. Distinctness of S and H Solutions 
(Exclusion of Competing Interpretations) 

Our basic procedure to prove distinctness of the shadow S and highlight H solutions will be to 
show that there is at least one relation between the four available image intensities ($I_{B1}$, $I_{B2}$, $I_{C1}$, $I_{C2}$)
that has different values for the shadow and highlight conditions. These values will always be different 
(if the constraints are valid) because the relation corresponds to two different physical variables (one 
for shadow, the other for highlights) that have non-overlapping values. 

To proceed, we ask first what highlight conditions satisfy the shadow solution (26). (Subsequently, 
we will examine the opposite case—asking what shadow conditions will “look like” highlights.) 7 We 
thus assume relation (26) holds and solve for one of the highlight conditions. Consider equation 
(22a) that specifies the magnitude of the specular components of the highlight. Note the numerator
is identical to the shadow equation (26c) if the left hand side (L.H.S.) of (26c) is subtracted from the
R.H.S. In this case, however, the numerator of (22a) will be zero. Hence the shadow condition requires
that $L_{\mathrm{specular}} = 0$ and consequently, given that the shadow condition (26) holds, there can be no
highlight interpretation.

To check for the reverse case, namely under what conditions the image intensity relations for the 
highlight condition will also yield a shadow interpretation, we may examine the second highlight 
equation (22b). In particular, we wish to solve for the physical interpretation of the intensity relations 
of (22b) given a shadow condition. This can be accomplished simply by substituting equations (24) 
into the R.H.S. of (22b). We find that, given the shadow conditions, then 


$$\frac{I_{B1} - I_{B2}}{I_{C1} - I_{C2}} = \frac{f_B + \gamma}{\gamma} = 1 + \frac{f_B}{\gamma} \tag{27}$$

Figure 5 now plots the possible values of the image intensity ratio given by the L.H.S. of (27) for
shadows and the R.H.S. of (22b) for highlights.

7 For another example treatment, see Ullman's (1979) analysis of false targets for his structure-from-motion theorems.

We note that both $f$ (the fraction of specularity or shadow) and $\gamma$ (the fraction of direct light)
range between 0 and 1. Hence for highlights, $1 - f$ must lie between 0 and 1, whereas for shadows
$1 + f/\gamma$ will be greater than or equal to 1. The only common condition is when $f = 0$, which
corresponds to a homogeneous matte area. Thus, highlights and shadows will never be confused from
the image intensities (provided the gray world assumption applies), if the calculation given by the
L.H.S. of (27) is made. It is of some interest that this operation on image intensities is equivalent to
examining the output of the double-opponent color cell found in most biological color vision systems
(see Rubin and Richards, 1981).
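The decision rule implied by this argument can be sketched in Python. The function computes the intensity ratio on the L.H.S. of (27) and labels the B/C boundary according to which side of 1 the ratio falls on. The function names, the labels and the tolerance `tol` are illustrative assumptions. The rule also presupposes the gray world condition and noise-free intensities.

```python
def intensity_ratio(i_b1, i_b2, i_c1, i_c2):
    """L.H.S. of (27): (I_B1 - I_B2) / (I_C1 - I_C2)."""
    if i_c1 == i_c2:
        raise ValueError("I_C1 == I_C2: the ratio is undefined")
    return (i_b1 - i_b2) / (i_c1 - i_c2)


def interpret(i_b1, i_b2, i_c1, i_c2, tol=1e-9):
    """Label the boundary between regions B and C."""
    ratio = intensity_ratio(i_b1, i_b2, i_c1, i_c2)
    if abs(ratio - 1) <= tol:
        return "matte"      # f = 0: homogeneous matte area
    if 0 <= ratio < 1:
        return "highlight"  # ratio = 1 - f_B, from (22b)
    if ratio > 1:
        return "shadow"     # ratio = 1 + f_B / gamma, from (27)
    return "neither"        # negative ratio: outside both solution ranges


# Highlight with f_B = 1/4, L = 0.9, M_1 = 0.2, M_2 = 0.5
print(interpret(0.375, 0.6, 0.2, 0.5))
# Shadow with f_B = 1/2, gamma = 1/4, S_1* = 0.8, S_2* = 0.4
print(interpret(0.6, 0.3, 0.2, 0.1))
```

A "neither" result signals intensities that fit none of the three cases, which is itself a reason to withhold a highlight or shadow interpretation.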


6.1 Corroboration 


Although the highlight H and shadow S solutions are unique and distinct, it is still possible that
other properties of surfaces, such as pigment density changes or changes in reflectances, could satisfy
equations (22) or (26) and be misinterpreted as either a highlight H or a shadow S. Thus a shadow
or highlight interpretation should not yet be given to the solutions H and S. To exclude all other
possibilities is difficult (see Rubin and Richards, 1981, however). Nevertheless, the odds for an incorrect
H or S interpretation can be reduced by applying an independent test for the validity of the shadow or
highlight equations. We call such a procedure “corroboration”.

One simple independent corroborative test is to note whether the equation-counting procedure
suggested more than one minimal condition for solution. In particular, we noted in section 5.6 that
the equation (20) had a symmetry in wavelength ($\lambda$) and space (Y). We chose as a starting point one
spectral sample and two samples in the Y dimension. An independent test would therefore be to use
two spectral samples rather than one, and only one sample in the Y dimension. This case corresponds
to examining the gradients of a highlight, or the penumbra of a shadow.


A second and more common type of corroborating procedure is to simply take another set of
measurements independent of the first, and determine whether the solutions for the physical constants
remain the same or not. If they do not, then the interpretation must be rejected. If they are confirmed,
then the odds on a misinterpretation are reduced. Ideally, the corroboration should be based upon
measurements taken from a physical dimension different from that used in the original solution. In
any case, since we are corroborating the value of a physical parameter, the corroborating measurements
must not be confounded with the dimensions of that physical parameter. In this respect, the
relation (27) that tests for the highlight or shadow condition is most satisfactory, for the values $f_B$ and
$\gamma$ are dimensionless and are not functions of wavelength, for example. For the shadow condition,
we thus can take a third spectral sample $I_{B3}$, $I_{C3}$ and substitute these image intensities for $I_{B2}$, $I_{C2}$.
Since the physical constant $(f_B + \gamma)/\gamma$ of equation (26b) is not a function of wavelength, this value
should remain unchanged if the image intensity changes are indeed due to a shadow. In effect, we
are confirming that the S solution point remains fixed along the solution ray illustrated in Fig. 5. If it
does, then the shadow (or highlight) interpretation is reaffirmed and misinterpretation
is unlikely, provided that the competing interpretations are not processes that behave like shadows.
Consequently, at least three wavelength samples are required before a reliable shadow interpretation
can be made.
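The shadow corroboration test can be sketched as follows. For each wavelength sample the ratio $I_B / I_C$ estimates the constant $(f_B + \gamma)/\gamma$, and the interpretation is corroborated only if all samples agree. The function name and the relative tolerance `rel_tol` are assumptions introduced for illustration; the text does not specify how much disagreement should be tolerated.

```python
def corroborated(samples, rel_tol=0.05):
    """samples: list of (I_B, I_C) pairs, one pair per wavelength.

    Returns True when every pair yields the same ratio I_B / I_C, which
    estimates (f_B + gamma) / gamma, to within rel_tol (an assumed noise
    tolerance, not taken from the text).
    """
    values = [i_b / i_c for i_b, i_c in samples]
    if len(values) < 3:
        raise ValueError("at least three wavelength samples are required")
    reference = values[0]
    return all(abs(v - reference) <= rel_tol * abs(reference) for v in values[1:])


# Shadow with f_B = 1/2, gamma = 1/4 and S* = 0.8, 0.4, 0.6 at three wavelengths
print(corroborated([(0.6, 0.2), (0.3, 0.1), (0.45, 0.15)]))
# Third sample inconsistent with a wavelength-independent constant
print(corroborated([(0.6, 0.2), (0.3, 0.1), (0.45, 0.3)]))
```

The first call agrees at all three wavelengths and the second does not, so only the first would support a shadow interpretation.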

In the case of recovering structure from motion (our earlier example), the corroboration of the axis
of rotation could entail adding additional frames or snapshots to see if the same axis and rod length
are recovered. Clearly, this procedure is not entirely independent because the strategy for solution
remains the same and some possible confounding interpretations may not be excluded (e.g., the
correct interpretation that the points are on a TV monitor in 2-D).

A more independent corroborative test would be to use stereopsis, for this computation of the
depth relations between the feature points is quite different from the structure-from-motion analysis.
The ideal corroborative procedure should thus use an entirely different computational analysis, which
is based upon relations that have quite different failure conditions. 8


7. Summary 

Although the equation-counting procedure has been used in the past to give some insight into the
complexity required to solve problems in many non-linear variables (e.g., Leith et al., 1981), researchers
in perception have often neglected to recognize that certain other conditions must be fulfilled
before a meaningful solution can be guaranteed (Meiri, 1980). These conditions are summarized in
the flow diagram of Fig. 6. They include the Jacobian test for the independence of the system of
equations, uniqueness of solution, exclusion of competing interpretations, and two kinds of corroboration.
If these conditions can be met, then the equation-counting procedure provides a powerful theoretical
tool for understanding how, in principle, biological systems can make reliable interpretations and
assertions from the greatly impoverished sense data available to them.


8 For biological systems, we probably should view "corroboration" as an early step in the perceptual process (perhaps
at the level of Marr's 2-1/2D sketch) that acts on the output of modules analyzing information derived from motion,
disparity, color, texture, etc., as well as non-visual information, such as tactile roughness, shape or even in some cases,
acoustic information.

Figure 6. Outline of Steps in 'Equation-Counting' Procedure.

Appendix I: Redundancy

Unfortunately, due to measurement and sampling errors, real-world data are not precise. The
hardware performing the calculations may also be quite noisy, as is the case for many neural networks.
Without exact data and calculations, solution vectors will not be completely isolated, but
rather are more properly represented as a probability distribution about the exact solution point. To
reduce the likelihood of misinterpretation, several overconstraining equations are often helpful. (By
“overconstraining” we here mean the inclusion of equations in addition to those needed to obtain a
unique solution.) Their value will depend in part upon how many variables (unknowns) are included
in the solution point. Intuitively, the more unknowns there are, the greater the potential noise and the less
the contribution of any one overconstraining equation will be. To capture this property, we suggest
the following measure of the redundancy of a system containing overconstraining equations.

$$\text{Redundancy} = 1 - \left[1 - \frac{1}{U}\right]^C \tag{A1}$$

where $C$ is the number of independent combinations of the equations and $U$ is the number of
unknowns. As $U$ increases, this measure decreases to zero. The effect of the additional overconstraining
equations, on the other hand, is to reduce the deleterious effect of increasing $U$ in a manner analogous
to probability summation, yet the redundancy measure will never exceed 1 (the ideal). The
redundancy measure has the practical value of providing an estimate of how many extra equations (or data
samples) are needed to isolate a solution point to a certain probability, given known measurement
signal to noise ratios.
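The behavior of the redundancy measure is easy to tabulate. The Python sketch below reads (A1) as $1 - (1 - 1/U)^C$, which is the form consistent with the limiting behavior described here: it tends to zero as $U$ grows and approaches 1 as $C$ grows. The function name is illustrative.

```python
def redundancy(n_combinations: int, n_unknowns: int) -> float:
    """Redundancy measure (A1), read as 1 - (1 - 1/U)**C."""
    if n_unknowns < 1 or n_combinations < 0:
        raise ValueError("need U >= 1 and C >= 0")
    return 1.0 - (1.0 - 1.0 / n_unknowns) ** n_combinations


for c in (1, 2, 4, 8):
    print(c, round(redundancy(c, 4), 4))
```

For a fixed number of unknowns, each added combination of equations raises the measure by a smaller amount than the last, which is the probability-summation behavior the text describes.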


Appendix II: Sard's Theorem for non-Polynomial functions

In many cases, the equations relating the unknown variables will not be polynomial and Bezout's
Theorem will not apply. These exceptions include such common functions as exponential, logarithmic,
or trigonometric functions. Sometimes, a change of variables can be made to recast the non-polynomial
relations in polynomial form. If this is done, then care must be taken to restrict the range over which
the polynomial form applies.

More generally, if a function is smooth on a manifold, then Sard's Theorem can be used (Guillemin
and Pollack, 1974; Milnor, 1978). Suppose that the following system of independent equations holds:

$$\begin{aligned}
f_1(x_1, \ldots, x_k) &= p_1 \\
&\;\;\vdots \\
f_n(x_1, \ldots, x_k) &= p_n
\end{aligned}$$

This system can then be represented more generally as a mapping from $R^k$ to $R^n$:

$$F: R^k \rightarrow R^n$$

or

$$F(x_1, \ldots, x_k) = \{f_i(x_1, \ldots, x_k)\}$$

By Sard's Theorem, we know that if $F$ is a smooth mapping and if $F$ is invertible for the values $p$,
then the dimension of $F^{-1}(p)$ is $(k - n)$. Since when $k = n$ the dimension of $F^{-1}(p)$ is zero, there
can be at most a countable number of (isolated) solutions.

Some care must be taken in assuming that Sard's Theorem applies to any differentiable function.
It does not. For example, consider the simple periodic function sin x. Such a function is uniquely
invertible only over a specified range. Polynomial functions are thus a "safer" class of functions to use
for equation counting, for their appropriate range is usually more obvious.
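The sin x example can be made concrete with a few lines of Python. The sketch lists every solution of sin x = p inside a given interval, showing that the inverse is unique on a restricted range but not on a wider one. The function name and the rounding used to merge duplicate roots are illustrative choices.

```python
import math


def sine_preimages(p, x_min, x_max):
    """All x in [x_min, x_max] with sin(x) = p, for |p| <= 1."""
    base = math.asin(p)
    solutions = set()
    k_lo = math.floor((x_min - math.pi) / (2 * math.pi)) - 1
    k_hi = math.ceil(x_max / (2 * math.pi)) + 1
    for k in range(k_lo, k_hi + 1):
        # The two solution families: asin(p) + 2*pi*k and pi - asin(p) + 2*pi*k.
        for x in (base + 2 * math.pi * k, math.pi - base + 2 * math.pi * k):
            if x_min <= x <= x_max:
                solutions.add(round(x, 12))
    return sorted(solutions)


print(len(sine_preimages(0.5, -math.pi / 2, math.pi / 2)))  # one solution
print(len(sine_preimages(0.5, 0.0, 4 * math.pi)))           # four solutions
```

The solutions remain isolated in both cases, but only the restricted interval gives a unique interpretation.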